Publication


A Novel Bangla Font Recognition Approach Using Deep Learning


Md. Majedul Islam, A K M Shahariar Azad Rabby, Nazmul Hasan, Jebun Nahar, Fuad Rahman
Accepted to be presented at IEMIS 2020: International Conference on Emerging Technologies in Data Mining and Information Security, 2nd - 4th July, 2020, Kolkata, India

Description
Font detection is an essential preprocessing step for printed character recognition. In this era of computerization and automation, computer-composed documents such as official documents, bank checks, loan applications, visiting cards, invitation cards, and educational materials are used everywhere. Beyond editing and processing documents, converting material such as invitation cards and billboards from one format to another is another major application area, where a designer has to recognize font details from images. A lot of research on automatic font detection has been published for high-resource languages such as English, but not much has been reported for a low-resource language such as Bangla. Bangla has a complex structure because of its use of diacritics, compound characters, and graphemes. Furthermore, because of the popularity of digital, online publications, there has been a recent surge of fonts in Bangla. Font detection can also help analysts detect changes in font choices based on socio-political divides: for example, fonts common in Bangladesh may not be as popular in Bangla publications in India. In this paper, we present a Convolutional Neural Network (CNN) approach for detecting Bangla fonts, using a space adjustment method based on a Stacked Convolutional Auto-Encoder (SCAE). As part of the work, we built a large corpus of printed documents consisting of 12,187 images in 7 different Bangla fonts, expanded through augmentation to a total of 77,728 samples, to train and validate our model. Our proposed model achieves 98.73% average font recognition accuracy on the validation set.
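
To put the corpus figures quoted in the description in perspective, the short Python sketch below works out the implied augmentation factor and the per-font counts. Only the three input numbers come from the description. The even split across the 7 fonts is an assumption made here for illustration; the description does not state how the images are distributed across fonts, and the function and key names are hypothetical.

```python
# Figures quoted in the description.
ORIGINAL_IMAGES = 12_187
AUGMENTED_SAMPLES = 77_728
NUM_FONTS = 7


def dataset_summary(images=ORIGINAL_IMAGES, samples=AUGMENTED_SAMPLES, fonts=NUM_FONTS):
    """Return simple ratios derived from the quoted corpus sizes."""
    return {
        # Average number of training/validation samples produced per source image.
        "augmentation_factor": samples / images,
        # ASSUMPTION: classes are perfectly balanced (not stated in the description).
        "images_per_font_if_balanced": images / fonts,
        "samples_per_font_if_balanced": samples / fonts,
    }


summary = dataset_summary()
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Both totals happen to divide evenly by 7, which is consistent with a balanced corpus but does not confirm it.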